Adaptive Data Partition Using Probability Distribution

Authors

  • Xipeng Shen
  • Chen Ding
Abstract

Many computing problems benefit from dynamically partitioning data into smaller chunks with better parallelism and locality. Previous partition methods either have high overhead or apply only to uniformly distributed data. This paper presents a new partition method for the sorting scenario based on probability distribution, an idea first studied by Janus and Lamagna in the early 1980s on a mainframe computer. The new method makes three contributions. The first is a rigorous sampling method that ensures an accurate estimate of the probability distribution. The second is an efficient implementation on modern machines. The third is the use of probability distribution in parallel sorting. Experiments show a 10-30% improvement in partition balance and a 20-70% reduction in partition overhead compared to approaches popular on current systems. When used in the classical problem of sorting, the method not only reduces parallel sorting time by 33-50%, but also outperforms the fastest sequential sorting methods from the recent literature by up to 30%.
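The general idea of partitioning by an estimated probability distribution can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed sample size, and the use of binary search over a sorted sample are assumptions made here for illustration, and the paper's sampling rule and its efficient lookup structure are not reproduced. The sketch estimates the cumulative distribution function (CDF) from a random sample and sends each element to the bucket given by its estimated CDF value, so buckets stay roughly balanced even when the data is not uniform.

```python
import bisect
import random


def partition_by_estimated_cdf(data, num_buckets, sample_size, seed=0):
    """Split data into value-ordered buckets using a sampled CDF (illustrative)."""
    rng = random.Random(seed)
    # Assumption: a plain random sample stands in for the paper's sampling method.
    sample = sorted(rng.sample(data, min(sample_size, len(data))))
    buckets = [[] for _ in range(num_buckets)]
    for x in data:
        # Empirical CDF of x: fraction of sampled values that are <= x.
        cdf = bisect.bisect_right(sample, x) / len(sample)
        # Map the CDF value in [0, 1] to a bucket index, clamping cdf == 1.
        index = min(int(cdf * num_buckets), num_buckets - 1)
        buckets[index].append(x)
    return buckets


def sort_with_partition(data, num_buckets=8, sample_size=256):
    """Sort by partitioning first, then sorting each bucket independently."""
    buckets = partition_by_estimated_cdf(data, num_buckets, sample_size)
    result = []
    for bucket in buckets:
        # Each bucket could be sorted by a separate worker in a parallel setting.
        result.extend(sorted(bucket))
    return result
```

Because the estimated CDF never decreases as the value grows, every element in one bucket is no larger than any element in the next, so concatenating the sorted buckets yields a fully sorted sequence.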


Similar resources

LPKP: location-based probabilistic key pre-distribution scheme for large-scale wireless sensor networks using graph coloring

Communication security of wireless sensor networks is achieved using cryptographic keys assigned to the nodes. Due to resource constraints in such networks, random key pre-distribution schemes are of high interest. Although most of these schemes do not consider location information, there are scenarios in which nodes can obtain location information after their deployment. In this paper,...


An Adaptive Approach to Increase Accuracy of Forward Algorithm for Solving Evaluation Problems on Unstable Statistical Data Set

Nowadays, hidden Markov models are extensively used for modeling stochastic processes. These models help researchers establish and implement the desired theoretical foundations using Markov algorithms such as the Forward algorithm. However, using the stability hypothesis and the mean statistic for determining the values of Markov functions on unstable statistical data sets has led to a significant reducti...


Adaptive-Filtering-Based Algorithm for Impulsive Noise Cancellation from ECG Signal

Suppression of noise and artifacts is a necessary step in biomedical data processing. Adaptive filtering is known as a useful method for this problem. Among the various contaminants, some, such as the electrical activity of muscles, contribute impulsive noise. This paper deals with modeling real-life muscle noise with an α-stable probability distribution and adaptive filterin...


Epsilon Entropy of Probability Distributions

This paper summarizes recent work on the theory of epsilon entropy for probability distributions on complete separable metric spaces. The theory was conceived [3] in order to have a framework for discussing the quality of data storage and transmission systems. The concept of data source was defined in [4] as a probabilistic metric space: a complete separable metric space together with a probabi...


Using Weibull probability distribution to calibrate prevailing wind applying in oil spill simulation

In the Persian Gulf, the major sources of oil pollution are tanker transportation, offshore production, and discharges from coastal refineries. The water dynamical field has been obtained using a new hydrodynamic model. Local wind is recognized as the principal driving force that combines with the water dynamic field to determine oil drift on the sea surface. The Weibull probability dis...




Publication date: 2003